Phonology-Augmented Statistical Framework for Machine Transliteration Using Limited Linguistic Resources
Authors
Abstract
Similar resources
Phonology-augmented statistical transliteration for low-resource languages
Transliteration converts words in a source language (e.g., English) into phonetically equivalent words in a target language (e.g., Vietnamese). This conversion needs to take into account the phonology of the target language, that is, the rules determining how phonemes can be organized. For example, a transliterated word in Vietnamese that begins with a consonant cluster is phonologically invalid. Whil...
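The onset constraint mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's method: the consonant inventory is a toy assumption rather than a complete Vietnamese phoneme set, and the function names `starts_with_consonant_cluster` and `filter_valid` are hypothetical. Candidates are represented as lists of phoneme symbols, not letters, because a multi-letter spelling can stand for a single phoneme.

```python
# Toy consonant inventory: an assumption for illustration only,
# not a complete or authoritative Vietnamese phoneme set.
CONSONANTS = {"b", "d", "f", "g", "h", "k", "l", "m", "n", "p", "s", "t", "v", "z"}


def starts_with_consonant_cluster(phonemes):
    """Return True if the first two phonemes are both consonants."""
    return (
        len(phonemes) >= 2
        and phonemes[0] in CONSONANTS
        and phonemes[1] in CONSONANTS
    )


def filter_valid(candidates):
    """Keep only candidates whose onset is not a consonant cluster."""
    return [c for c in candidates if not starts_with_consonant_cluster(c)]
```

A statistical transliteration system could apply such a check as a filter on its candidate outputs, although the truncated abstract does not say how the paper integrates phonological rules.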
Statistical Machine Translation: Rapid Development with Limited Resources
We describe an experiment in rapid development of a statistical machine translation (SMT) system from scratch, using limited resources: under this heading we include not only training data, but also computing power, linguistic knowledge, programming effort, and absolute time.
Using Linguistic Knowledge in Statistical Machine Translation
In this thesis, we present methods for using linguistically motivated information to enhance the performance of statistical machine translation (SMT). One of the advantages of the statistical approach to machine translation is that it is largely language-agnostic. Machine learning models are used to automatically learn translation patterns from data. SMT can, however, be improved by using lingui...
Tajik-Farsi Persian Transliteration Using Statistical Machine Translation
Tajik Persian is a dialect of Persian spoken primarily in Tajikistan and written with a modified Cyrillic alphabet. Iranian Persian, or Farsi, as it is natively called, is the lingua franca of Iran and is written with the Persian alphabet, a modified Arabic script. Although the spoken versions of Tajik and Farsi are mutually intelligible to educated speakers of both languages, the difference be...
Transliteration by Bidirectional Statistical Machine Translation
The system presented in this paper uses phrase-based statistical machine translation (SMT) techniques to directly transliterate between all language pairs in this shared task. The technique makes no language-specific assumptions and uses no dictionaries or explicit phonetic information. The translation process transforms sequences of tokens in the source language directly into sequences of toke...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2019
ISSN: 2329-9290,2329-9304
DOI: 10.1109/taslp.2018.2875269